Abstract

Abduction is one of the most important forms of reasoning; it has
been successfully applied to several practical problems, such as
diagnosis. In this paper we investigate whether the computational
complexity of abduction can be reduced by an appropriate use of
preprocessing. This is motivated by the fact that part of the data of
the problem (namely, the set of all possible assumptions and the theory
relating assumptions and manifestations) is often known before the rest
of the problem. We show some complexity results about abduction when
compilation is allowed.






Contents

1 Introduction

2 Preliminaries

3 Complexity and Compilability

4 Compilability of Abduction: No Ordering
   4.1 The Method
   4.2 Existence of Solutions
   4.3 Verification
   4.4 Relevance, Dispensability, and Necessity

5 Compilability of Abduction: Preferences
   5.1 Some General Results
   5.2 Verification
   5.3 Relevance, Dispensability, and Necessity

6 Compilability of Abduction: Prioritization and Penalization
   6.1 Verification
   6.2 Relevance and Necessity

7 The Horn Case

8 Conclusions



1 Introduction 

Deduction, induction, and abduction [Pei55] are the three basic reasoning
mechanisms. Deduction allows drawing conclusions from known facts using
some piece of knowledge, so that "battery is down" allows concluding "car
will not start" thanks to the knowledge of the rule "if the battery is down,
the car will not start". Induction derives rules from the facts: from the fact
that the battery is down and that the car is not starting up, we may conclude
the rule relating these two facts. Abduction is the inverse of deduction (to
some extent [MF96]): from the fact that the car is not starting up, we conclude
that the battery is down. Clearly, this is not the only possible explanation
of a car not starting up. Therefore, we may get more than one explanation.
This is an important difference between abduction and deduction, making
the former, in general, more complex.

While deduction formalizes the process of drawing conclusions, abduction
formalizes the diagnostic process, which attempts to invert the cause-effect
relation by inferring the causes from their observable effects. The example of
the car shows such an application: complete knowledge about the car would allow
finding (i.e., abducing) the possible reasons why the car is not starting up.
The following example shows how abduction can be applied to formalize a
diagnostic scenario.

Example 1 While writing a paper with some authors located in another 
country, you get a set of macros that are used in a nice figure they drew. 
However, when compiling the .tex file, an incomprehensible error message 
results. Four explanations are possible: 

a : the macro has been used with the wrong arguments; 

p : the package X is required; 

t : the macro is incompatible with package X ; 

v : the wrong version of TeX has been used. 

This scenario can be formalized in logical terms by introducing a variable
$f$ to denote the presence of compile errors: since each of the facts above
causes $f$, we know $a \rightarrow f$, $p \rightarrow f$, etc. Moreover, we know
that a package cannot at the same time be required and incompatible with the
macros. The following theory $T$ formalizes our knowledge.

$$T = \{a \rightarrow f,\ p \rightarrow f,\ t \rightarrow f,\ v \rightarrow f,\ \neg(p \wedge t)\}$$

This theory relates the observed effect (the compile error) with its possible
causes (we used the wrong version of TeX, etc.). Therefore, it can be used to
find the possible causes: namely, an explanation is a set of facts that logically
imply the observed effect. Formally, an explanation is a set of variables that
allows deriving the observed effects from the theory $T$. However, to make
sense, an explanation has to be consistent with our knowledge, that is, with
the theory $T$.

This example shows that a given problem of abduction may have one,
none, or even many possible solutions (explanations). Moreover, both a
consistency check and an implication check are required just to verify an
explanation. These facts intuitively explain why abduction is expected to be
harder than deduction. This observation has indeed been confirmed by
theoretical results. Selman and Levesque [SL90] and Bylander et al. [BATJ89]
proved the first results about fragments of abductive reasoning, Eiter and
Gottlob [EG95] presented an extensive analysis, and Eiter and Makino have
shown the complexity of computing all abductive explanations [EM02]. All these
results proved that abduction is, in general, harder than deduction. The
analysis has also shown that several problems are of interest in abduction.
Not only is the problem of finding an explanation relevant, but so are the
problems of checking an explanation and of deciding whether a fact is in all,
or some, of the explanations.

A common fact about deduction and abduction is that the knowledge
relating facts may be known in advance, while the particular observation
may change from time to time. In the example of the car, the fact that a
dead battery prevents the car from starting is always known, while the fact
that the battery is dead may or may not be true. Likewise, the possible causes
of TeX errors are known before a specific error message comes out.

We can therefore assign two different statuses to the knowledge base and
to the single facts: while the knowledge base is fixed, the single facts
vary. In the example above, $T$ will always reflect the state of the world,
while $f$ is only true when TeX complains about something.

This difference has computational consequences. While the example we
have shown here does not present any problem of efficiency, larger and more
complex abduction problems result from the formalization of real-world
domains. The difference of status between $T$ and the observations can then
be exploited. Indeed, since $T$ is always the same, we can perform a
preprocessing step on it alone, even before the status of the observations is
known. Clearly, we cannot explain an observation we do not know. However, this
preprocessing step can be used to carry out in advance the part of the
computation that depends on $T$ alone. As a result, finding a solution might
take less time when the observation finally becomes known.

The idea of using a preprocessing step to speed up the solution of
abduction problems is not new. For instance, Console, Portinale, and
Dupre [CPT96] have shown how compiled knowledge can be used in the process of
abductive diagnosis.

Preprocessing part of the input data has also been used in many other
areas of computer science, as there are many problems with a similar pattern
of fixed and varying parts. However, the first formalization of intractability
with preprocessing is relatively recent [CDLS02]. In this paper, we
characterize the complexity of abduction problems from this point of view.

2 Preliminaries 

The problem of abduction is formalized by a knowledge base, a set of
observations, and a set of possible facts that can explain the observations.
In this paper, we are only concerned with propositional logic. Therefore, the
knowledge is formalized by a propositional theory $T$. We usually denote by
$M$ the set of observations.

The theory $T$ must necessarily contain all variables of $M$, otherwise
there would be no way of explaining the observations. In general, the theory
$T$ contains other variables as well, describing facts whose truth we do not
know. Some of these facts can be taken as part of a possible explanation,
while others cannot. Intuitively, when we are trying
to establish the causes of an observation, we want the first cause, and not
something that is only a consequence of it. In the example of the car, the
fact that there is no voltage at the starter motor explains the fact that the
car is not starting up, but it is not an acceptable explanation, as it does
not tell where the real problem is (the battery). Therefore, the abduction
problem is not defined only in terms of the theory and the observation, but
also of the set of possible facts (variables) we would accept as first causes
of the observation.

Formally, an instance of abduction is a triple $\langle H, M, T \rangle$. The
observations are formalized as $M$, which is a set of variables. $T$ is a
propositional theory formalizing our knowledge of the domain. Finally, $H$ is
a set of variables; these variables are the ones formalizing facts that we
regard as possible first causes.

Abduction is the process of explaining the observation. Its outcome will
therefore be a set of facts from which all observations can be inferred. Since
we can only use variables of $H$ to form explanations, these will be subsets
$H' \subseteq H$. Moreover, an explanation can only be accepted if it is
consistent with our knowledge. This leads to the following definition of the
possible solutions (explanations) of a given abduction problem
$\langle H, M, T \rangle$.

$$SOL(H, M, T) = \{H' \subseteq H \mid H' \cup T \text{ is consistent and } H' \cup T \models M\}$$

We apply this definition to the running example of the TeX file. 

Example 2 The propositional theory of the example shown in the introduction
is $T = \{a \rightarrow f,\ p \rightarrow f,\ t \rightarrow f,\ v \rightarrow f,\ \neg(p \wedge t)\}$.
The observation is the variable formalizing the presence of compiler errors,
that is, $M = \{f\}$. Of the variables of $T$, all but $f$ can be taken as
possible first causes of the problem, that is, $H = \{a, p, t, v\}$.

Abduction amounts to finding a set of literals that explain the observation
$f$. Formally, this is captured by the constraint $H' \cup T \models M$. Note
that $H' = \{f\}$ satisfies this formula; this is not an acceptable
explanation: "the reason why the file does not compile is that it does not
compile" is a tautology, not an explanation. This problem is avoided by
enforcing $H' \subseteq H$.

All non-empty subsets of $H$ imply, together with $T$, the observation $M$.
However, the subsets containing both $p$ and $t$ are inconsistent with $T$.
Therefore, the set of solutions of the problem is given by:

$$SOL(H, M, T) = \{H' \subseteq H \mid H' \neq \emptyset,\ \{t, p\} \not\subseteq H'\}$$

This is simply the formal result of our current definition. However, some
explanations in this set are not really reasonable: for example, the
explanation $\{a, t, v\}$ seems overly pessimistic: the macro has been called
with the wrong arguments, and it is incompatible with package X, and we used
the wrong version of TeX.
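
The set of solutions of Example 2 can be checked mechanically. The following Python sketch enumerates all subsets of $H$ and keeps those that are consistent with $T$ and entail $M$; it is only a brute-force illustration for this five-variable instance, and the function names (`theory`, `models_of`, `is_explanation`, `sol`) are ours, not part of the paper.

```python
from itertools import combinations, product

VARS = ["a", "p", "t", "v", "f"]   # alphabet of T
H = ["a", "p", "t", "v"]           # hypotheses
M = ["f"]                          # manifestations


def theory(m):
    """Evaluate T = {a->f, p->f, t->f, v->f, not(p and t)} on the model m."""
    causes_imply_f = all((not m[h]) or m["f"] for h in H)
    return causes_imply_f and not (m["p"] and m["t"])


def models_of(assumed):
    """Yield the models of T in which every variable of `assumed` is true."""
    for values in product([False, True], repeat=len(VARS)):
        m = dict(zip(VARS, values))
        if theory(m) and all(m[x] for x in assumed):
            yield m


def is_explanation(candidate):
    """candidate + T is consistent, and candidate + T entails every variable of M."""
    ms = list(models_of(candidate))
    return bool(ms) and all(all(m[x] for x in M) for m in ms)


def sol():
    """Brute-force SOL(H, M, T): test every subset of H."""
    subsets = (set(c) for k in range(len(H) + 1) for c in combinations(H, k))
    return [s for s in subsets if is_explanation(s)]


solutions = sol()
for s in solutions:
    print(sorted(s))
```

The enumeration is exponential in the size of $H$, so it is meant for checking small examples only.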

The set $SOL(H, M, T)$ contains all explanations we consider possible.
However, some explanations may be more likely than others. For example,
explanations requiring a large number of assumptions are often less likely
than explanations with fewer assumptions.

Likeliness of explanations is formalized by an ordering $\preceq$ over the
subsets of $H$. Given a specific $\preceq$, the set of minimal solutions is
defined as follows.

$$SOL_\preceq(H, M, T) = \min(SOL(H, M, T), \preceq)$$

The ordering $\preceq$ is used to formalize the relative plausibility of
explanations: $H' \prec H''$ means that $H'$ is considered more likely to be
the "real" cause of the manifestations than $H''$. The ordering $\preceq$
represents the concept of "at least as likely as", thus both
$H' \preceq H''$ and $H'' \preceq H'$ hold if $H'$ and $H''$ are equally
likely. The definition of $SOL_\preceq$ formalizes the principle of choosing
only the explanations we consider most likely.

An implicit assumption of this definition is that the ordering $\preceq$ does
not depend on the set of manifestations. We also assume that $\preceq$ is a
"well-founded" ordering, that is, any non-empty set of explanations has at
least one $\preceq$-minimal element. Therefore, if the set $SOL(H, M, T)$ is
not empty, then $\min(SOL(H, M, T), \preceq)$ is not empty as well.

In this paper we take into account several plausibility orderings. The
absence of a preference among the explanations can be formalized as the
ordering $\preceq$ that is equal to the universal relation, that is,
$H' \preceq H''$ for any pair of sets of variables $H'$ and $H''$.

Besides this no-information ordering, the two simplest and most natural
orderings are $\subseteq$-preference, where an explanation $H_1$ is more
likely than $H_2$ if $H_1 \subset H_2$, and $\le$-preference, where $H_1$ is
preferred to $H_2$ if it contains fewer hypotheses, that is, $|H_1| < |H_2|$.

Both these orderings are based on the principle of making as few hypotheses
as possible, and on the assumption that all hypotheses are equally likely. Two
other orderings follow from assuming that the hypotheses are not equally
likely: $\subseteq$-prioritization and $\le$-prioritization.

In particular, we assume that the hypotheses are partitioned into
equivalence classes of equal likeliness. Let $\langle H_1, \ldots, H_m \rangle$
be such a partition. By definition, it holds $H_1 \cup \cdots \cup H_m = H$ and
$H_i \cap H_j = \emptyset$ for each $i \neq j$. The instances of the problem
of abduction can thus be written as
$\langle \langle H_1, \ldots, H_m \rangle, M, T \rangle$. The set of all
assumptions $H$ is implicitly defined as the union of the classes $H_i$. We
assume that the hypotheses in $H_1$ are the most likely, while those in $H_m$
are the least likely.

Both $\subseteq$-prioritization and $\le$-prioritization compare explanations
on the basis of their relative plausibility. Namely, the explanations that use
hypotheses in lower classes are more likely than explanations using hypotheses
in higher classes. This idea, when combined with subset containment, defines
$\subseteq$-prioritization. When it is combined with the cardinality-based
ordering, it defines $\le$-prioritization. The formal definitions are given
below.

Penalization is the last form of preference we consider. The idea is to
assign weights to assumptions to formalize their likeliness. Explanations
with the least total weight are preferred. Weights encode the likeliness of
assumptions: the higher the weight of an assumption, the less likely it is
to be true. To use penalization, the instance of the problem must include,
besides $H$, $M$, and $T$, an $n$-tuple of weights
$W = \langle w_1, \ldots, w_n \rangle$, where each $w_i$ is an integer number
(the weight) associated with a variable $h_i \in H$. The instance can thus be
written $\langle W, H, M, T \rangle$.

The considered orderings are formally defined as follows: 

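$\subseteq$-preference: $H' \preceq H''$ if and only if $H' \subseteq H''$;

$\le$-preference: $H' \preceq H''$ if and only if $|H'| \le |H''|$;

$\subseteq$-prioritization: $H' \preceq H''$ if and only if $H' = H''$ or
there exists $i$ such that $H' \cap H_m = H'' \cap H_m$, ...,
$H' \cap H_i = H'' \cap H_i$, $H' \cap H_{i-1} \subset H'' \cap H_{i-1}$;

$\le$-prioritization: $H' \preceq H''$ if and only if either
$|H' \cap H_i| = |H'' \cap H_i|$ for each $i$, or there exists $i$ such that
$|H' \cap H_m| = |H'' \cap H_m|$, ..., $|H' \cap H_i| = |H'' \cap H_i|$,
$|H' \cap H_{i-1}| < |H'' \cap H_{i-1}|$;

penalization: $H' \preceq H''$ if and only if
$\sum_{h_i \in H'} w_i \le \sum_{h_j \in H''} w_j$.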

Let us consider the use of these orderings on the running example. 

Example 3 The use of $\subseteq$-preference or $\le$-preference reduces the
set of possible explanations of the example of the TeX file. Namely,
$\le$-preference allows only minimal-size explanations to be solutions of the
problem. The only such explanations are $\{a\}$, $\{p\}$, $\{t\}$, and
$\{v\}$. The explanation $\{a, t, v\}$, not being minimal, is no longer a
solution of the problem. The use of preference therefore avoids having as
solutions sets that contain too many hypotheses. Since $\subseteq$-preference
only selects explanations that do not contain other ones, the only
solutions it produces are $\{a\}$, $\{p\}$, $\{t\}$, and $\{v\}$. In
this case, the two kinds of preference generate the same solutions, but
this is not always the case.

Prioritization allows for a further refinement of the set of solutions by
exploiting the plausibility ordering over the hypotheses. For example, we may
assume that the fact that package X is required and that we used the wrong
version of the compiler are the two most likely hypotheses. They will then
be part of the first set of assumptions $H_1$, while the other assumptions
go in $H_2$. Formally, the problem instance is now
$\langle \langle H_1, H_2 \rangle, M, T \rangle$. Both
$\subseteq$-prioritization and $\le$-prioritization produce $\{p\}$ and
$\{v\}$ as the only minimal explanations. This is because all other
explanations either have a bigger intersection with $H_2$, or an equal
intersection with $H_2$ but a bigger intersection with $H_1$.

Finally, penalization requires a weight (an integer number) for each
hypothesis. Let us for example use the tuple of weights
$\langle 4, 2, 4, 1 \rangle$ associated with the hypotheses
$\langle a, p, t, v \rangle$. Since larger weights correspond to less likely
hypotheses, we are assuming that our first and third hypotheses ($a$ and $t$)
are the least likely, while $p$ is more likely and $v$ is the most likely.
By definition, the explanation $\{v\}$ is the one having the least weight, and
is therefore the only solution of the problem.
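
The minimal solutions claimed in Example 3 can be reproduced with a short Python sketch. It starts from the set $SOL(H, M, T)$ of Example 2 (non-empty subsets of $H$ not containing both $p$ and $t$) and filters it with each ordering. The helper names (`minimal`, `subset_prio`, and so on) are illustrative choices of ours; the orderings follow the formal definitions given above, scanning the priority classes from $H_m$ down to $H_1$.

```python
from itertools import combinations

H = ["a", "p", "t", "v"]
# SOL(H, M, T) of the running example: non-empty subsets without both p and t.
SOL = [frozenset(c) for k in range(1, len(H) + 1) for c in combinations(H, k)
       if not {"p", "t"} <= set(c)]


def minimal(solutions, leq):
    """Keep s unless some s2 is strictly preferred (s2 <= s but not s <= s2)."""
    return [s for s in solutions
            if not any(leq(s2, s) and not leq(s, s2) for s2 in solutions)]


def subset_pref(x, y):
    return x <= y


def card_pref(x, y):
    return len(x) <= len(y)


def subset_prio(classes):
    """classes = [H_1, ..., H_m]; compare from the least likely class H_m down."""
    def leq(x, y):
        if x == y:
            return True
        for c in reversed(classes):
            if x & c != y & c:
                return (x & c) < (y & c)   # proper subset on first difference
        return False
    return leq


def card_prio(classes):
    def leq(x, y):
        for c in reversed(classes):
            if len(x & c) != len(y & c):
                return len(x & c) < len(y & c)
        return True                        # equal cardinality in every class
    return leq


def penalization(weights):
    def leq(x, y):
        return sum(weights[h] for h in x) <= sum(weights[h] for h in y)
    return leq


classes = [frozenset({"p", "v"}), frozenset({"a", "t"})]   # H_1, H_2
weights = {"a": 4, "p": 2, "t": 4, "v": 1}

for name, leq in [("subset-preference", subset_pref),
                  ("cardinality-preference", card_pref),
                  ("subset-prioritization", subset_prio(classes)),
                  ("cardinality-prioritization", card_prio(classes)),
                  ("penalization", penalization(weights))]:
    print(name, sorted(sorted(s) for s in minimal(SOL, leq)))
```

The `minimal` helper compares every pair of solutions, which is adequate here only because the example has eleven solutions.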

The basic problem of abduction is that of finding one or more explana- 
tions. However, we have already remarked that none may exist. Therefore, 
the first problem we consider is the existence one: given an instance of abduc- 
tion, does an explanation exist? Another related problem is that of verifying, 
once a set of hypotheses has been found, whether it is really an explanation 
or not. 

Other problems are related to the structure of the explanations. Namely,
hypotheses that are in all explanations may be considered as "sure"
conclusions of the abductive process. On the other hand, hypotheses that are
part of some explanations can be regarded as "possible" conclusions.

The formal definition of these questions as decision problems is as follows. 

Existence: is there an explanation of the observed manifestations? That is,
$SOL(H, M, T) \neq \emptyset$?

Verification: given a set $H' \subseteq H$, is $H'$ a minimal solution? That
is, $H' \in SOL_\preceq(H, M, T)$?

Relevance: given a variable $h \in H$, is there a minimal solution containing
$h$? That is, $\exists H' \subseteq H$ such that
$H' \in SOL_\preceq(H, M, T)$ and $h \in H'$?

Necessity: is $h \in H$ in all, and at least one, minimal solution? That is,
$SOL(H, M, T) \neq \emptyset$ and $\forall H' \subseteq H$ we have that
$H' \in SOL_\preceq(H, M, T)$ implies $h \in H'$?

Dispensability: is $h \in H$ such that either there is no solution or there
exists one that does not contain $h$? That is, $SOL(H, M, T) = \emptyset$ or
$\exists H' \subseteq H$ such that $H' \in SOL_\preceq(H, M, T)$ and
$h \notin H'$?

Dispensability is the converse of the problem of necessity, since a
hypothesis $h$ is dispensable if and only if it is not necessary. The problem
of dispensability is not of much interest by itself, but is sometimes useful
for simplifying the proofs.
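
The five decision problems can be read as simple predicates over the sets $SOL$ and $SOL_\preceq$ once these are available explicitly. The Python sketch below states them in that form; it assumes the sets have already been enumerated (which is exactly the expensive part), and the function names are ours. The sample data are the solutions of the TeX example, with the minimal solutions obtained under prioritization and under penalization in Example 3.

```python
from itertools import combinations

H = ["a", "p", "t", "v"]
# SOL(H, M, T) of the running example.
SOL = [frozenset(c) for k in range(1, len(H) + 1) for c in combinations(H, k)
       if not {"p", "t"} <= set(c)]
# Minimal solutions reported in Example 3.
MIN_PRIORITIZATION = [frozenset({"p"}), frozenset({"v"})]
MIN_PENALIZATION = [frozenset({"v"})]


def exists(sol):
    """Existence: SOL is not empty."""
    return len(sol) > 0


def verify(candidate, min_sol):
    """Verification: the candidate is a minimal solution."""
    return frozenset(candidate) in min_sol


def relevant(h, min_sol):
    """Relevance: some minimal solution contains h."""
    return any(h in s for s in min_sol)


def necessary(h, sol, min_sol):
    """Necessity: a solution exists and every minimal solution contains h."""
    return exists(sol) and all(h in s for s in min_sol)


def dispensable(h, sol, min_sol):
    """Dispensability: no solution exists, or some minimal solution omits h."""
    return (not exists(sol)) or any(h not in s for s in min_sol)


print(relevant("p", MIN_PRIORITIZATION))        # p occurs in a minimal solution
print(necessary("p", SOL, MIN_PRIORITIZATION))  # but {v} does not contain p
print(necessary("v", SOL, MIN_PENALIZATION))    # v is in the only minimal solution
```

Under the well-foundedness assumption, a non-empty `sol` always comes with a non-empty `min_sol`, so `dispensable` and `necessary` return opposite answers on the same input.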

Clearly, the ordering does not matter for the problem of existence, since 
we consider only well-founded orderings: therefore, an explanation exists 
if and only if a minimal explanation exists. For the other problems, the 
ordering must be taken into account. Different orderings may lead to different 
computational properties. 

In this paper, we assume that $T$ is a 3CNF formula: this assumption does
not cause a loss of generality unless we want to assume that
$H \cup M = Var(T)$.

3 Complexity and Compilability 

The basic complexity classes of the polynomial hierarchy [Sto76, GJ79], such
as P, NP, coNP, etc., are assumed known to the reader. We denote by C,
C', etc. arbitrary classes of the polynomial hierarchy. The length of a string
$x \in \Sigma^*$ is denoted by $\|x\|$.

We summarize some definitions and results proposed to formalize the on- 
line complexity of problems [CDLS02]. In computational complexity, prob- 
lems whose solution can only be yes or no are the most commonly analyzed. 
Such problems are called decision problems. Any such problem can be
formalized as a set of strings, those whose solution is yes. For example, the
problem of propositional satisfiability (deciding whether a formula is
satisfiable or not)
is characterized by the set of the strings that represent exactly all satisfiable 
formulae. 

The strings that compose the set associated with a problem represent the
possible problem instances that produce a positive solution. Problems like
abduction, however, have instances that can be naturally broken into two
parts: one part is known in advance ($T$ and $H$) and one part is only known
at run-time ($M$). The instances of such problems are therefore better encoded
as pairs of strings, and a problem like abduction is formalized by a
set of pairs of strings, rather than a set of strings. We define a language of
pairs $S$ as a subset of $\Sigma^* \times \Sigma^*$.

The difference between the first and second element of a pair is that some
preprocessing time can be spent on the first string alone. The aim is to
solve the problem faster when the second string becomes known.
While our final aim is to reduce the running time of this second phase, some
constraints have to be put on the preprocessing phase. Namely, we require
its result to be of polynomial size. Poly-size functions are introduced for
this purpose: a function $f$ from strings to strings is called poly-size if
there exists a polynomial $p$ such that, for all strings $x$, it holds
$\|f(x)\| \le p(\|x\|)$. An exception to this definition is when $x$
represents a natural number: in this case, we impose $\|f(x)\| \le p(x)$. Any
poly-time function is poly-size, but not vice versa. Indeed, a function $g$ is
poly-time if there exists a polynomial $q$ such that, for all $x$, $g(x)$ can
be computed in time less than or equal to $q(\|x\|)$.
Clearly, the running time also bounds the size of the output string; on the
other hand, even a function requiring exponential running time can produce
a very short output. The definitions of poly-size and poly-time function
extend to binary functions as usual.

Using the above definitions, we introduce a new hierarchy of classes of
languages of pairs, the non-uniform compilability classes [CDLS02], denoted
by $\|{\leadsto}\mathrm{C}$, where C is a generic uniform complexity class,
such as P, NP, coNP, or $\Sigma_2^p$.

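Definition 1 ($\|{\leadsto}\mathrm{C}$ classes, [CDLS02]) A language of pairs
$S \subseteq \Sigma^* \times \Sigma^*$ belongs to $\|{\leadsto}\mathrm{C}$ iff
there exists a binary poly-size function $f$ and a language of pairs
$S' \in \mathrm{C}$ such that, for all pairs $\langle x, y \rangle$, it holds:

$$\langle x, y \rangle \in S \quad \text{iff} \quad \langle f(x, \|y\|), y \rangle \in S'$$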

Clearly, any problem whose time complexity is in C is also in
$\|{\leadsto}\mathrm{C}$: just take $f(x, \|y\|) = x$ and $S' = S$. Some
problems in C, however, belong to $\|{\leadsto}\mathrm{C}'$ with
$\mathrm{C}' \subset \mathrm{C}$; for example, some problems in NP are in
$\|{\leadsto}\mathrm{P}$. These are in fact the problems we are most
interested in, as the preprocessing phase, running on $x$ only, will produce
$f(x)$, which allows solving the problem in polynomial time. This is important
if these problems cannot be solved in polynomial time without the
preprocessing phase (e.g., they are NP-complete).

The class $\|{\leadsto}\mathrm{C}$ generalizes the non-uniform class C/poly
(i.e., $\mathrm{C/poly} \subseteq \|{\leadsto}\mathrm{C}$) by allowing for a
fixed part $x$. We extend the definition of polynomial reduction to a concept
that can be used with these classes.

Definition 2 (Non-uniform comp-reduction) A non-uniform comp-reduction
is a triple of functions $\langle f_1, f_2, g \rangle$, where $g$ is poly-time
and $f_1$ and $f_2$ are poly-size. Given two problems $A$ and $B$, $A$ is
non-uniformly comp-reducible to $B$ (denoted by $A \le_{nucomp} B$) iff there
exists a non-uniform comp-reduction $\langle f_1, f_2, g \rangle$ such that,
for every pair $\langle x, y \rangle$, it holds that
$\langle x, y \rangle \in A$ if and only if
$\langle f_1(x, \|y\|), g(f_2(x, \|y\|), y) \rangle \in B$.

These reductions allow for a concept of hardness and completeness for
the classes $\|{\leadsto}\mathrm{C}$.

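Definition 3 ($\|{\leadsto}\mathrm{C}$-completeness) Let $S$ be a language of
pairs and C a complexity class. $S$ is $\|{\leadsto}\mathrm{C}$-hard iff for
all problems $A \in \|{\leadsto}\mathrm{C}$ we have that $A \le_{nucomp} S$.
Moreover, $S$ is $\|{\leadsto}\mathrm{C}$-complete if $S$ is in
$\|{\leadsto}\mathrm{C}$ and is $\|{\leadsto}\mathrm{C}$-hard.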

The hierarchy formed by the compilability classes is proper if and only if 
the polynomial hierarchy is proper [CDLS02, KL80, Yap83] — a fact widely 
conjectured to be true. 

Informally, $\|{\leadsto}\mathrm{NP}$-hard problems are "not compilable to P".
Indeed, if such compilation were possible, then it would be possible to define
$f$ as the function that takes the fixed part of the problem and gives the
result of compilation (ignoring the size of the input), and $S'$ as the
language representing the on-line processing. This would imply that a
$\|{\leadsto}\mathrm{NP}$-hard problem is in $\|{\leadsto}\mathrm{P}$, and this
implies the collapse of the polynomial hierarchy. In general, a problem that
is $\|{\leadsto}\mathrm{C}$-complete for a class C can be regarded as the
"toughest" problem in C, under the assumption that preprocessing the fixed
part is possible.

While $\|{\leadsto}\mathrm{C}$-completeness is adequate to show the
compilability level of a given reasoning problem, proving it requires finding
a nucomp reduction. We show a technique that lets us reuse, with simple
modifications, the poly-time reductions that were used to prove the usual
(uniform) hardness of the problem. Namely, we present sufficient conditions
allowing for a polynomial reduction to imply the existence of a nucomp
reduction [Lib01].

Let us assume that we know a polynomial reduction from the problem A 
to the problem B, and we want to prove the nucomp-hardness of B. Some 
conditions on A should hold, as well as a condition over the reduction. If all 
these conditions are verified, then there exists a nucomp reduction from *A 
to B. 

Definition 4 (Classification Function) A classification function for a
problem $A$ is a polynomial function $Class$ from instances of $A$ to
nonnegative integers, such that $Class(y) \le \|y\|$.

Definition 5 (Representative Function) A representative function for a
problem $A$ is a polynomial function $Repr$ from nonnegative integers to
instances of $A$, such that $Class(Repr(n)) = n$, and $\|Repr(n)\|$ is
bounded by some polynomial in $n$.

Definition 6 (Extension Function) An extension function for a problem
$A$ is a polynomial function from instances of $A$ and nonnegative integers to
instances of $A$ such that, for any $y$ and $n \ge Class(y)$, the instance
$y' = Exte(y, n)$ satisfies the following conditions:

1. $y \in A$ if and only if $y' \in A$;

2. $Class(y') = n$.

Let us give some intuitions about these functions. Usually, an instance 
of a problem is composed of a set of objects combined in some way. For 
problems on boolean formulas, we have a set of variables combined to form a 
formula. For graph problems, we have a set of nodes, and the graph is indeed 
a set of edges, which are pairs of nodes. The classification function gives the 
number of objects in an instance. The representative function thus gives an 
instance with the given number of objects. This instance should be in some 
way "symmetric", in the sense that its elements should be interchangeable 
(this is because the representative function must be determined only from 
the number of objects.) Possible results of the representative function can 
be the set of all clauses of three literals over a given alphabet, the complete 
graph over a set of nodes, the graph with no edges, etc. 

Let for example A be the problem of propositional satisfiability. We can 
take Class(F) as the number of variables in the formula F, while Repr(n) can 
be the set of all clauses of three literals over an alphabet of n variables. Fi- 
nally, a possible extension function is obtained by adding tautological clauses 
to an instance. 
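
For propositional satisfiability, the three functions just described can be sketched in Python as follows. This is an illustration under assumptions of ours: a formula is a set of clauses, a clause is a set of non-zero integers in the DIMACS style (`3` for a variable, `-3` for its negation), clauses of `repr_of` use three distinct variables (so it needs $n \ge 3$), and `extend` pads an instance with tautological clauses $x \vee \neg x$ over fresh variables. The names `class_of`, `repr_of`, and `extend` stand for $Class$, $Repr$, and $Exte$.

```python
from itertools import combinations, islice, product


def variables(formula):
    return {abs(lit) for clause in formula for lit in clause}


def class_of(formula):
    """Class(F): the number of variables occurring in F."""
    return len(variables(formula))


def repr_of(n):
    """Repr(n): all clauses of three literals over variables 1..n (n >= 3)."""
    return frozenset(
        frozenset(sign * var for sign, var in zip(signs, triple))
        for triple in combinations(range(1, n + 1), 3)
        for signs in product([1, -1], repeat=3))


def extend(formula, n):
    """Exte(F, n): add tautologies over fresh variables until Class is n."""
    used = variables(formula)
    if n < len(used):
        raise ValueError("n must be at least Class(formula)")
    fresh = (v for v in range(1, n + len(used) + 1) if v not in used)
    padding = [frozenset({v, -v}) for v in islice(fresh, n - len(used))]
    return frozenset(formula) | frozenset(padding)


def satisfiable(formula):
    """Brute-force satisfiability test, for checking small examples only."""
    vs = sorted(variables(formula))
    for values in product([False, True], repeat=len(vs)):
        model = dict(zip(vs, values))
        if all(any(model[abs(l)] == (l > 0) for l in c) for c in formula):
            return True
    return False


F = frozenset({frozenset({1, -2}), frozenset({2})})
G = extend(F, 5)
print(class_of(F), class_of(G), satisfiable(F) == satisfiable(G))
```

Padding with tautologies changes the number of variables without changing satisfiability, which is the property required of an extension function.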

Note that these functions are related to the problem $A$ only, and do not
involve the specific problem $B$ we want to prove hard, nor the specific
reduction used. We now define a condition over the poly-time reduction from
$A$ to $B$. Since $B$ is a problem of pairs, we can define a reduction from
$A$ to $B$ as a pair of polynomial functions $\langle r, h \rangle$ such that
$x \in A$ if and only if $\langle r(x), h(x) \rangle \in B$.

Definition 7 (Representative Equivalence) Given a problem $A$ (having
the above three functions), a problem of pairs $B$, and a polynomial reduction
$\langle r, h \rangle$ from $A$ to $B$, the condition of representative
equivalence holds if, for any instance $y$ of $A$, it holds:

$$\langle r(y), h(y) \rangle \in B \quad \text{iff} \quad \langle r(Repr(Class(y))), h(y) \rangle \in B$$

The condition of representative equivalence can be proved to imply that
the problem $B$ is $\|{\leadsto}\mathrm{C}$-hard, if $A$ is C-hard [Lib01].

4 Compilability of Abduction: No Ordering



In this section we analyze the problems of existence of explanations,
explanation verification, relevance, and necessity, for the basic case in
which no ordering is defined. Formally, we want to determine whether the
complexity of the problems related to $SOL(H, M, T)$ decreases thanks to the
preprocessing step on $H$ and $T$.

We first give a high-level explanation of the method we use to prove
the incompilability of the considered problems. We begin by applying the
method to the problem of existence of explanations, and then we use it for
verification, relevance, and necessity.

4.1 The Method 

The problem of deciding whether there exists an explanation for a set of
manifestations is $\Sigma_2^p$-hard [EG95]. Therefore, there exists a
polynomial reduction from another $\Sigma_2^p$-hard problem to this one. In
order to prove that it is also $\|{\leadsto}\Sigma_2^p$-hard, we could show
that the other problem has the three functions, and that the reduction
satisfies the condition of representative equivalence. Unfortunately, this is
not the case. As a result, we have to look for another reduction.

Such a reduction should be as simple as possible. In general, the more
similar two problems are, the easier it is to find a reduction. What is the
$\Sigma_2^p$-hard problem that is the most similar to the problem of existence
of explanations? Clearly, the problem itself is the most similar to itself.

The theorem of representative equivalence is indeed about a reduction
between two problems $A$ and $B$, but it does not forbid using the same
problem: it only states that, if we have a reduction from an arbitrary
$\Sigma_2^p$-hard problem $A$ to $B$ satisfying representative equivalence,
then $B$ is $\|{\leadsto}\Sigma_2^p$-hard. Nothing prevents us from choosing
$A = B$. This technique can be formalized as follows:

• show that there exist classification, representative, and extension
functions for the problem $B$;

• show that there exists a reduction from $B$ to $B$ satisfying representative
equivalence.

The most obvious reduction from a problem to itself is the identity. In
our case, however, identity does not satisfy the condition of representative
equivalence. As a result, we have to look for another reduction.

Before showing the technical details of the reductions used, we point out
an important feature of this technique. Since the condition of representative
equivalence tells that $B$ is $\|{\leadsto}\mathrm{C}$-hard if $A$ is C-hard,
using $A = B$ we prove that $B$ is $\|{\leadsto}\mathrm{C}$-hard whenever $B$
is C-hard. This result holds even if a precise complexity characterization of
$B$ is not known. For example, if we only know that $B$ is in $\Sigma_2^p$,
but do not have any hardness result, we can still conclude that $B$ is
$\|{\leadsto}\mathrm{NP}$-hard if it is NP-hard, it is
$\|{\leadsto}\mathrm{coNP}$-hard if it is coNP-hard, it is
$\|{\leadsto}\Sigma_2^p$-hard if it is $\Sigma_2^p$-hard, etc.

In order to simplify the following proofs, we denote by $\Pi(X)$ the set of
all distinct clauses of length 3 on a given alphabet
$X = \{x_1, \ldots, x_n\}$. Since the theory $T$ is in 3CNF by assumption, we
have that $T \subseteq \Pi(V)$, where $V$ is the set of variables appearing in
$T$.

4.2 Existence of Solutions 

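In order to define a reduction from the problem of existence of solutions to
itself, we first consider the function $f$ from abduction instances to
abduction instances defined as follows:

$$f(\langle H, M, T \rangle) = \langle H', M', T' \rangle$$

where:

$$H' = H \cup C \cup D$$

$$M' = M \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\}$$

$$T' = \{\neg c_i \vee \neg d_i \mid \gamma_i \in \Pi(H \cup X)\} \cup \{c_i \rightarrow \gamma_i \mid \gamma_i \in \Pi(H \cup X)\}$$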

In these formulae, X denotes the alphabet of T, while C and D are sets 
of new variables in one-to-one correspondence with the clauses in U(HUX). 
Note that, by definition, T is a subset of U(H U X). The following lemma 
relates the solutions of (H, M, T) with the solutions of {H', M', T'). 
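
The construction of f can be sketched in Python as follows. This is only an illustration under stated assumptions: manifestations are taken to be variables, a literal is encoded as a (variable, polarity) pair, $\Pi$ is taken to contain every set of three distinct literals, and the names `pi`, `reduce_f`, `c1`, `d1`, ... are hypothetical and assumed not to clash with the variables of the instance.

```python
from itertools import combinations

def pi(alphabet):
    """All clauses of three distinct literals; a literal is (variable, polarity)."""
    literals = [(v, s) for v in sorted(alphabet) for s in (True, False)]
    return [frozenset(t) for t in combinations(literals, 3)]

def reduce_f(H, M, T, X):
    """Sketch of f: H, M are sets of variables, T a set of clauses, X the alphabet of T.

    The names c1, d1, ... are assumed not to occur in H or X.
    """
    gammas = pi(set(H) | set(X))
    c = {g: f"c{i}" for i, g in enumerate(gammas, 1)}
    d = {g: f"d{i}" for i, g in enumerate(gammas, 1)}
    H2 = set(H) | set(c.values()) | set(d.values())
    M2 = (set(M)
          | {c[g] for g in gammas if g in T}
          | {d[g] for g in gammas if g not in T})
    T2 = set()
    for g in gammas:
        T2.add(frozenset({(c[g], False), (d[g], False)}))  # not c_i or not d_i
        T2.add(g | {(c[g], False)})                        # c_i -> gamma_i
    return H2, M2, T2

# Two instances over the same alphabet but with different theories.
X = {"h", "a", "m"}
g1 = frozenset({("h", False), ("a", True), ("m", True)})
g2 = frozenset({("h", True), ("a", False), ("m", True)})
H1, M1, T1 = reduce_f({"h"}, {"m"}, {g1}, X)
H2, M2, T2 = reduce_f({"h"}, {"a"}, {g1, g2}, X)
print(H1 == H2 and T1 == T2, M1 == M2)
```

The two calls return the same H' and T' and differ only in M': the original theory has been moved entirely into the manifestations.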

Lemma 1 Let f be the function defined above. For any H, M, T, it holds:

\[ SOL(f(\langle H, M, T \rangle)) = \{S \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\} \mid S \in SOL(\langle H, M, T \rangle)\} \]






Proof. We divide the proof into three parts. In the first part, we prove that
any solution of $f(\langle H, M, T \rangle)$ contains exactly the literals $c_i$ and $d_i$ that are
in M'. In the second part, we prove that, if S' is a solution of $f(\langle H, M, T \rangle)$,
then $S' \setminus (C \cup D)$ is a solution of $\langle H, M, T \rangle$; the third part is the proof of the
converse.

1. We prove that $S' \cap (C \cup D) = \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\}$. Let
$R = \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\}$. Since $R \subseteq M'$, we have that
$S' \cup T' \models R$. If $c_i \in R$, then $S' \cup T' \models c_i$. Since T' does not contain any
positive occurrence of $c_i$, the theory $S' \cup T'$ can imply $c_i$ only if $c_i \in S'$.
The same holds for any $d_i \in R$. This proves that $S' \cap (C \cup D) \supseteq R$.
Since R contains either $c_i$ or $d_i$ for any i, the same holds for S'. No
other variable in $C \cup D$ can be in S', otherwise S' would be inconsistent
with T', which contains the clauses $\neg c_i \vee \neg d_i$.

2. Let S' be an element of $SOL(\langle H', M', T' \rangle)$. We prove that
$S = S' \setminus (C \cup D) \in SOL(\langle H, M, T \rangle)$. The point proved above shows that, for each i,
S' contains either $c_i$ or $d_i$, depending on whether $\gamma_i \in T$. As a result:

\[
\begin{aligned}
S' \cup T' &= S \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\} \cup \{\neg c_i \vee \neg d_i\} \cup \{c_i \rightarrow \gamma_i\} \\
&\equiv S \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\} \cup T
\end{aligned}
\]

As a result, $S \cup T$ is consistent because the above formula is. Moreover,
since the above formula implies M, and each variable in $C \cup D$ appears
only once, it also holds $S \cup T \models M$. As a result, S is a solution of
$\langle H, M, T \rangle$.

3. Let $S \in SOL(\langle H, M, T \rangle)$, and let $S' = S \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\}$.
Since $S' \cup T'$ is equivalent to $S \cup T \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\}$,
S' is a solution of $\langle H', M', T' \rangle$.

The claim is thus proved. □ 

This lemma shows that any abduction instance can be converted into
another one in which the set H and the theory T depend only on the
variables of the original instance. This conversion can be used to build a
reduction satisfying the condition of representative equivalence.
Lemma 2 Let c be a positive integer, and let $g_c$ be the following
function:

\[ g_c(\langle H, M, T \rangle) = \langle H \cup \{h_{m+1}, \ldots, h_c\}, M, T \cup \{x_{r+1} \vee \neg x_{r+1}, \ldots, x_c \vee \neg x_c\} \rangle \]

where $m = |H|$ and $r = |Var(T) \setminus H|$. It holds

\[ SOL(g_c(\langle H, M, T \rangle)) = \{S \cup H' \mid S \in SOL(\langle H, M, T \rangle) \text{ and } H' \subseteq \{h_{m+1}, \ldots, h_c\}\} \]

Proof. The instance $g_c(\langle H, M, T \rangle)$ only differs from $\langle H, M, T \rangle$ because of
the new assumptions $h_{|H|+1}, \ldots, h_c$, which are not even mentioned in T, and
the new tautological clauses added to T. Therefore, any explanation of $\langle H, M, T \rangle$ is
also an explanation of $g_c(\langle H, M, T \rangle)$. The only difference between these two
problems is that the assumptions $h_{|H|+1}, \ldots, h_c$ can be freely added to any
explanation. □
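
The statement of Lemma 2 can be checked by brute force on a tiny instance. The sketch below is not part of the original proof: it assumes manifestations are variables, encodes literals as (variable, polarity) pairs, decides consistency and entailment by enumerating truth assignments, and assumes the original variables are already named h1, ..., hm and x1, ..., xr so that the padded names are fresh. All function names are hypothetical.

```python
from itertools import combinations, product

def variables_of(T):
    return {v for clause in T for v, _ in clause}

def models(T, variables):
    """Yield every truth assignment over `variables` that satisfies all clauses of T."""
    names = sorted(variables)
    for values in product((False, True), repeat=len(names)):
        a = dict(zip(names, values))
        if all(any(a[v] == s for v, s in clause) for clause in T):
            yield a

def solutions(H, M, T):
    """All S subset of H such that S + T is consistent and entails every variable in M."""
    variables = set(H) | set(M) | variables_of(T)
    found = set()
    for k in range(len(H) + 1):
        for S in combinations(sorted(H), k):
            ms = [a for a in models(T, variables) if all(a[h] for h in S)]
            if ms and all(a[m] for a in ms for m in M):
                found.add(frozenset(S))
    return found

def g_c(H, M, T, c):
    """Pad the instance with unused assumptions and tautologies up to size c."""
    m = len(H)
    r = len(variables_of(T) - set(H))
    new_h = {f"h{j}" for j in range(m + 1, c + 1)}
    taut = {frozenset({(f"x{j}", True), (f"x{j}", False)}) for j in range(r + 1, c + 1)}
    return set(H) | new_h, set(M), set(T) | taut

# Instance: h1 -> x1, with manifestation x1.
H, M = {"h1"}, {"x1"}
T = {frozenset({("h1", False), ("x1", True)})}
H2, M2, T2 = g_c(H, M, T, 2)
new_h = H2 - H
expected = {S | frozenset(extra)
            for S in solutions(H, M, T)
            for k in range(len(new_h) + 1)
            for extra in combinations(sorted(new_h), k)}
print(solutions(H2, M2, T2) == expected)
```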

We now define the classification, representative, and extension functions 
for the basic problems of abduction. First, the classification function is given 
by the maximum between the number of variables in H and the number of 
variables in T but not in H: 

\[ \mathrm{Class}(\langle H, M, T \rangle) = \max(|H|, |Var(T) \setminus H|) \]

The representative instance of the class c is given by an instance with
c possible assumptions, c other variables, and T composed of all possible
clauses of three literals over these variables:

\[ \mathrm{Repr}(c) = \langle \{h_1, \ldots, h_c\}, \emptyset, \Pi(\{h_1, \ldots, h_c\} \cup \{x_1, \ldots, x_c\}) \rangle \]

The extension function is also easy to give. For example, we may add to
T a set of tautologies with new variables.

\[ \mathrm{Ext}(\langle H, M, T \rangle, m) = \langle H, M, T \cup \{x_{r+1} \vee \neg x_{r+1}, \ldots, x_m \vee \neg x_m\} \rangle \quad \text{where } r = |Var(T) \setminus H| \]

These three functions are valid classification, representative, and exten- 
sion functions for the problem of existence of explanation; they are also valid 
for the problems of relevance and necessity. 
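
A minimal sketch of the classification and extension functions, assuming literals are encoded as (variable, polarity) pairs and that the padding names `_x<j>` (a hypothetical naming scheme) do not occur in the instance. It shows that extending an instance to size m yields an instance of class m whenever m is at least the number of assumptions.

```python
def variables_of(T):
    return {v for clause in T for v, _ in clause}

def classify(H, M, T):
    """Class: the larger of |H| and the number of variables of T not in H."""
    return max(len(H), len(variables_of(T) - set(H)))

def extend(H, M, T, m):
    """Ext: add tautologies over fresh variables until T has m non-assumption variables."""
    r = len(variables_of(T) - set(H))
    tautologies = {frozenset({(f"_x{j}", True), (f"_x{j}", False)})
                   for j in range(r + 1, m + 1)}
    return set(H), set(M), set(T) | tautologies

H = {"h1", "h2"}
M = {"a"}
T = {frozenset({("h1", False), ("a", True)})}  # h1 -> a
print(classify(H, M, T), classify(*extend(H, M, T, 5)))
```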

We are now able to show a reduction satisfying the condition of representative
equivalence. Let i be the reduction defined as follows.

\[ i(\langle H, M, T \rangle) = f(g_{\mathrm{Class}(\langle H, M, T \rangle)}(\langle H, M, T \rangle)) \]

The following theorem is a consequence of the fact that i satisfies the
condition of representative equivalence.
Theorem 1 The problem of establishing the existence of solutions of an
abductive problem is $\|{\leadsto}\Sigma^p_2$-hard.

Proof. By the above two lemmas, $i(\langle H, M, T \rangle)$ has solutions if and only if
$\langle H, M, T \rangle$ has solutions. Therefore, i is a valid reduction from the problem
of solution existence to itself. The fixed part of $i(\langle H, M, T \rangle)$ only depends
on the class of the instance $\langle H, M, T \rangle$. As a result, this reduction satisfies
the condition of representative equivalence. Since the problem of existence
of solutions is $\Sigma^p_2$-hard [EG95], it is also $\|{\leadsto}\Sigma^p_2$-hard. □



4.3 Verification 

We consider the problem of verifying whether a set of assumptions is a possible
explanation, still in the case of no ordering. An instance of the problem
is composed of a triple $\langle H, M, T \rangle$ and a specific subset $H_a \subseteq H$ that we want
to check for being an explanation. Formally, this problem amounts to checking
whether $H_a \cup T$ is consistent and $H_a \cup T \models M$. The varying part is composed
of $H_a$ and M. Formally, an instance of the verification problem is a 4-tuple
$\langle H, H_a, M, T \rangle$, where $H_a \subseteq H$.

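The first step of the proof is that of finding the three functions (classification,
representative, and extension). The functions of the last proof only
require minor changes to be used now.

\[
\begin{aligned}
\mathrm{Class}(\langle H, H_a, M, T \rangle) &= \max(|H|, |Var(T) \setminus H|) \\
\mathrm{Repr}(c) &= \langle \{h_1, \ldots, h_c\}, \emptyset, \emptyset, \Pi(\{h_1, \ldots, h_c\} \cup \{x_1, \ldots, x_c\}) \rangle \\
\mathrm{Ext}(\langle H, H_a, M, T \rangle, c) &= \langle H, H_a, M, T \cup \{x_{r+1} \vee \neg x_{r+1}, \ldots, x_c \vee \neg x_c\} \rangle
\end{aligned}
\]

where $r = |Var(T) \setminus H|$.

We define two functions $f'$ and $g'_c$ to be similar to the functions f and $g_c$
of the last section, except for the addition of a candidate explanation $H_a$.

\[ f'(\langle H, H_a, M, T \rangle) = \langle H', H_a \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\}, M', T' \rangle \]

where $\langle H', M', T' \rangle = f(\langle H, M, T \rangle)$.

\[ g'_c(\langle H, H_a, M, T \rangle) = \langle H', H_a, M', T' \rangle \]

where $\langle H', M', T' \rangle = g_c(\langle H, M, T \rangle)$.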

These functions can be composed to generate a function that satisfies 
representative equivalence. This way, we prove the nucomp-hardness of the 
problem of verification. 






Theorem 2 The problem of verification with no ordering is $\|{\leadsto}\mathrm{D}^p$-complete.

Proof. By Lemma 1 and Lemma 2, $H_a \subseteq H$ is a solution of $g_c(\langle H, M, T \rangle)$ if
and only if it is a solution of $\langle H, M, T \rangle$, and $H_a \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\}$
is a solution of $f(\langle H, M, T \rangle)$ if and only if $H_a$ is a solution of $\langle H, M, T \rangle$.

As a result, both $f'$ and $g'_c$ are reductions from the problem of verification
to itself. Moreover, their composition $i'$ satisfies representative equivalence,
since the fixed part of $i'(\langle H, H_a, M, T \rangle)$ only depends on the class of the
instance $\langle H, H_a, M, T \rangle$. We can then conclude that the problem of verification
is hard for the compilability class that corresponds to the complexity class it
is hard for. □



4.4 Relevance, Dispensability, and Necessity 

We make the following simplifying assumption: given an instance of abduction
$\langle H, M, T \rangle$, where $H = \{h_1, \ldots, h_m\}$, the problem is to decide whether
the first assumption $h_1$ is relevant/dispensable/necessary. Clearly, the
complexity of these problems is the same as that of the general ones, as we can
always rename the variables appropriately.

Theorem 3 The problems of relevance and dispensability with no ordering
are $\|{\leadsto}\Sigma^p_2$-hard, while necessity is $\|{\leadsto}\Pi^p_2$-hard.

Proof. By Lemma 1 and Lemma 2, $i(\langle H, M, T \rangle)$ is a reduction from the problem
of relevance to the problem of relevance. Indeed, for any $H_a \subseteq H$, the set
$H_a \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\}$ is a solution of $f(g_c(\langle H, M, T \rangle))$ if and only if
$H_a$ is a solution of $\langle H, M, T \rangle$. As a result, $h_1$ is relevant/dispensable/necessary
for $\langle H, M, T \rangle$ if and only if it is so for $f(g_c(\langle H, M, T \rangle))$.

The function i satisfies representative equivalence, since the fixed part of
$i(\langle H, M, T \rangle)$ only depends on the class of $\langle H, M, T \rangle$. What is left to prove
is the existence of the three functions. We can use the same three functions used
for the problem of existence of solutions. □



5 Compilability of Abduction: Preferences 

In this section, we consider the problems of verification, relevance, and
necessity when the ordering used is either $\leq$ or $\subseteq$. These orderings have in
common the fact that the instance of an abduction problem is simply a
triple $\langle H, M, T \rangle$, whereas the orderings of the next section employ classes
of priority or weights that are part of the instances. The problem of existence
is the same as with no ordering, as these orderings are well founded.

5.1 Some General Results 

We give some general results about the problem of abduction in the case in
which an ordering on explanations is given. In order to keep results as general
as possible, we consider an arbitrary ordering $\preceq$ satisfying the following
natural conditions.

Meaningful. The ordering $\preceq$ is meaningful if, for any variable h and any
pair of sets H' and H'' such that $h \notin H' \cup H''$, it holds:

\[ H' \cup \{h\} \preceq H'' \cup \{h\} \quad \text{iff} \quad H' \preceq H'' \]

Intuitively, a meaningful ordering compares two explanations H' and
H'' only on the variables on which they differ.

Irredundant. The ordering $\preceq$ is irredundant if, for any pair of sets H' and
H'', it holds:

\[ H' \subset H'' \Rightarrow H' \prec H'' \]

Irredundancy formalizes the natural assumption that hypotheses that
are not necessary should be removed.
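
Both conditions can be checked exhaustively on a small universe. The sketch below is an illustration, not part of the original development: it encodes an ordering as a Python predicate `leq(a, b)` on frozensets (a hypothetical interface), derives the strict part as "a ⪯ b and not b ⪯ a", and tests set inclusion, cardinality, and a deliberately bad ordering that prefers larger sets.

```python
from itertools import combinations

def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

def strictly(leq):
    """Strict part of an ordering given as a predicate leq(a, b)."""
    return lambda a, b: leq(a, b) and not leq(b, a)

def is_meaningful(leq, universe):
    """Adding the same variable h to both sets never changes the comparison."""
    sets = subsets(universe)
    for h in universe:
        rest = [s for s in sets if h not in s]
        for a in rest:
            for b in rest:
                if leq(a | {h}, b | {h}) != leq(a, b):
                    return False
    return True

def is_irredundant(leq, universe):
    """Every proper subset is strictly preferred to its superset."""
    lt = strictly(leq)
    sets = subsets(universe)
    return all(lt(a, b) for a in sets for b in sets if a < b)

def by_inclusion(a, b):
    return a <= b

def by_cardinality(a, b):
    return len(a) <= len(b)

def prefer_larger(a, b):  # counterexample: meaningful but not irredundant
    return len(a) >= len(b)

U = {"h1", "h2", "h3", "h4"}
for leq in (by_inclusion, by_cardinality, prefer_larger):
    print(leq.__name__, is_meaningful(leq, U), is_irredundant(leq, U))
```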

We determine the compilability of abduction with preference in the same
way we did in the case of no ordering: we show that the function i is a
polynomial reduction from the problems of abduction to themselves, and
that it satisfies the condition of representative equivalence. To this aim, we
need the analogues of Lemma 1 and Lemma 2.

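Lemma 3 If $\preceq$ is a meaningful ordering, it holds:

\[ SOL_\preceq(f(\langle H, M, T \rangle)) = \{S \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\} \mid S \in SOL_\preceq(\langle H, M, T \rangle)\} \]

Proof. We use the result of Lemma 1. Namely, since all solutions of $f(\langle H, M, T \rangle)$
coincide on $C \cup D$, these variables are irrelevant thanks to the fact that $\preceq$ is
meaningful.

Formally, writing $R = \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\}$ for brevity, we have:

\[
\begin{aligned}
& S \in SOL_\preceq(f(\langle H, M, T \rangle)) \\
\Leftrightarrow\ & S \in SOL(f(\langle H, M, T \rangle)) \text{ and } \nexists S' \in SOL(f(\langle H, M, T \rangle)) \,.\, S' \prec S \\
\Leftrightarrow\ & S = S_1 \cup R,\ S_1 \in SOL(\langle H, M, T \rangle) \text{ and } \nexists S_1' \in SOL(\langle H, M, T \rangle) \text{ such that } S_1' \cup R \prec S_1 \cup R \\
\Leftrightarrow\ & S = S_1 \cup R,\ S_1 \in SOL(\langle H, M, T \rangle) \text{ and } \nexists S_1' \in SOL(\langle H, M, T \rangle) \,.\, S_1' \prec S_1 \\
\Leftrightarrow\ & S = S_1 \cup R \text{ and } S_1 \in SOL_\preceq(\langle H, M, T \rangle)
\end{aligned}
\]

This proves the claim. □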
We can also prove the analogue of Lemma 2.

Lemma 4 Let c be a positive integer and let $g_c$ be the following
function:

\[ g_c(\langle H, M, T \rangle) = \langle H \cup \{h_{|H|+1}, \ldots, h_c\}, M, T \cup \{x_{r+1} \vee \neg x_{r+1}, \ldots, x_c \vee \neg x_c\} \rangle \]

where $r = |Var(T) \setminus H|$. If $\preceq$ is an irredundant ordering, it holds:

\[ SOL_\preceq(g_c(\langle H, M, T \rangle)) = SOL_\preceq(\langle H, M, T \rangle) \]

Proof. Similar to the proof of Lemma 2, but now the hypotheses in $\{h_{|H|+1}, \ldots, h_c\}$
are all irrelevant; therefore, they are not part of any minimal explanation. □

These lemmas can be used to prove incompilability of abduction when an 
irredundant and meaningful ordering is used. 

5.2 Verification 

We consider the problem of verifying whether a set of assumptions is a minimal
explanation according to the orderings $\leq$ and $\subseteq$. More generally, we
prove the following theorem for any meaningful and irredundant ordering.

Theorem 4 If $\preceq$ is a meaningful and irredundant ordering, verifying whether
a set of assumptions is a minimal explanation is $\|{\leadsto}\mathrm{C}$-hard for any class C
for which the problem is C-hard.

Proof. The same classification, representative, and extension functions used
for the case of no ordering can be used for this case as well.

Let us now consider the functions $f'$ and $g'_c$. From Lemma 3 and Lemma 4
it follows that they are reductions from the problem of verification to itself.
Moreover, their composition $i'$ satisfies representative equivalence. □

5.3 Relevance, Dispensability, and Necessity 

We make the following simplifying assumption: given an instance of abduction
$\langle H, M, T \rangle$, where $H = \{h_1, \ldots, h_m\}$, the problem is to decide whether
the first assumption $h_1$ is relevant/dispensable/necessary. There is no loss of
generality in making this assumption, as we can always rename the variables
appropriately.

Theorem 5 If $\preceq$ is a meaningful and irredundant ordering, then the problems
of relevance/dispensability/necessity are $\|{\leadsto}\mathrm{C}$-hard for any class C of
the polynomial hierarchy for which they are C-hard.

Proof. From Lemma 3 and Lemma 4, it follows that the reduction i is a
reduction from the problems of relevance/dispensability/necessity to themselves,
if $\preceq$ is meaningful and irredundant, and it also satisfies representative
equivalence. □

Since $\subseteq$ and $\leq$ are meaningful irredundant orderings, the complexity of the
corresponding problems implies their compilability characterization.

Corollary 1 Relevance and dispensability using $\subseteq$ are $\|{\leadsto}\Sigma^p_2$-hard, while
using $\leq$ they are $\|{\leadsto}\Delta^p_3[\log n]$-hard. Necessity is $\|{\leadsto}\Pi^p_2$-hard and
$\|{\leadsto}\Delta^p_3[\log n]$-hard, using $\subseteq$ and $\leq$, respectively.

6 Compilability of Abduction: Prioritization 
and Penalization 

We consider the cases in which the ordering over the explanations is defined
in terms of a prioritization. The instances of the problem are different from
those of the previous section, since H is replaced by a partition of assumptions
$\langle H_1, \ldots, H_m \rangle$.

In the cases of $\leq$-prioritization and $\subseteq$-prioritization, the induced ordering
$\preceq$ is meaningful and irredundant. However, the results on meaningful
irredundant orderings cannot be directly applied because, in Theorem 4 and
Theorem 5, we assumed that the instances have the form $\langle H, M, T \rangle$, while
now they have the form $\langle \langle H_1, \ldots, H_m \rangle, M, T \rangle$. Therefore, we have to find
new classification, representative, and extension functions.

We first consider the problem of verification, and prove its nucomp-hardness.
Then, we move to the problems of relevance, dispensability, and
necessity. As for the case of $\leq$-preference and $\subseteq$-preference, we employ a
sort of normal form, in which the assumption we check is the first one.

6.1 Verification 

First of all, we show the classification, representative, and extension func- 
tions for the problem of verification. The instances of the problem include a 
"candidate explanation" H a . 

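\[
\begin{aligned}
\mathrm{Class}(\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle) &= \max(m, |H_1|, \ldots, |H_m|, |Var(T) \setminus \textstyle\bigcup H_i|) \\
\mathrm{Repr}(c) &= \langle \langle \{h^1_1, \ldots, h^1_c\}, \ldots, \{h^c_1, \ldots, h^c_c\} \rangle, \emptyset, \emptyset, \Pi(\{h^1_1, \ldots, h^1_c\} \cup \cdots \cup \{h^c_1, \ldots, h^c_c\} \cup \{x_1, \ldots, x_c\}) \rangle \\
\mathrm{Ext}(\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle, m) &= \langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \cup \{x_{r+1} \vee \neg x_{r+1}, \ldots, x_m \vee \neg x_m\} \rangle
\end{aligned}
\]

where $r = |Var(T) \setminus \bigcup H_i|$.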

These functions can be easily proved to be valid classification, repre- 
sentative, and extension functions. What is missing is a reduction from 
the problem of verification to itself satisfying the condition of representative 
equivalence. 

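To this end, we use two functions $f''$ and $g''_c$ that are similar to f and $g_c$,
respectively. In particular,
$f''(\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle) = \langle \langle H'_1, \ldots, H'_m \rangle, H'_a, M', T' \rangle$,
where:

\[
\begin{aligned}
H'_1 &= H_1 \cup C \cup D \\
H'_2 &= H_2 \\
&\ \ \vdots \\
H'_m &= H_m \\
H'_a &= H_a \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\} \\
M' &= M \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\} \\
T' &= \{\neg c_i \vee \neg d_i \mid \gamma_i \in \Pi(Var(T) \cup \textstyle\bigcup H_i)\} \cup \{c_i \rightarrow \gamma_i \mid \gamma_i \in \Pi(Var(T) \cup \textstyle\bigcup H_i)\}
\end{aligned}
\]

Besides the partition of the assumptions, this is exactly the function used
in Lemma 1. As a result, we have that:

\[ SOL(\langle \langle H'_1, \ldots, H'_m \rangle, M', T' \rangle) = \{S \cup \{c_i \mid \gamma_i \in T\} \cup \{d_i \mid \gamma_i \notin T\} \mid S \in SOL(\langle \langle H_1, \ldots, H_m \rangle, M, T \rangle)\} \]

Since $\preceq$ is a meaningful irredundant ordering, the same property holds
replacing $SOL$ with $SOL_\preceq$. The last step is to define a function similar
to $g_c$. This is done as follows.

\[
\begin{aligned}
g''_c(\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle) = \langle & \langle H_1 \cup \{h^1_{|H_1|+1}, \ldots, h^1_c\}, \ldots, H_m \cup \{h^m_{|H_m|+1}, \ldots, h^m_c\}, \ldots, \{h^c_1, \ldots, h^c_c\} \rangle, H_a, \\
& M, T \cup \{x_{r+1} \vee \neg x_{r+1}, \ldots, x_c \vee \neg x_c\} \rangle
\end{aligned}
\]

where $r = |Var(T) \setminus \bigcup H_i|$.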

In words, each $H_i$ is extended with new assumptions to make it contain
exactly c assumptions. Some new classes of assumptions $H_i$ are added, in
such a way that the resulting instance contains exactly c classes of assumptions.
Finally, T is extended with tautologies over new variables, in such a way that the
variables of the new theory that are not assumptions are exactly c.

The resulting instance is defined in such a way that all its relevant numbers
(number of classes of assumptions, number of assumptions in each
class, number of other variables in the theory) coincide. The analogue of
Lemma 4 holds: the solutions of $\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle$ and the solutions
of $g''_c(\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle)$ coincide. This is due to the fact that $g''_c$ only
introduces new variables that are irrelevant to the minimal solutions.
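
The padding performed by $g''_c$ can be sketched as follows. This is an illustrative sketch only: the function name `pad_prioritized` is hypothetical, literals are encoded as (variable, polarity) pairs, and the new names `h<i>_<j>` and `x<j>` are assumed not to clash with variables already present in the instance.

```python
def pad_prioritized(classes, H_a, M, T, c):
    """Pad a prioritized instance so that it has c classes of c assumptions each,
    and c non-assumption variables in the theory."""
    m = len(classes)
    padded = []
    for i in range(1, c + 1):
        base = set(classes[i - 1]) if i <= m else set()
        extra = {f"h{i}_{j}" for j in range(len(base) + 1, c + 1)}
        padded.append(base | extra)
    assumptions = set().union(*classes)
    r = len({v for clause in T for v, _ in clause} - assumptions)
    tautologies = {frozenset({(f"x{j}", True), (f"x{j}", False)})
                   for j in range(r + 1, c + 1)}
    return padded, set(H_a), set(M), set(T) | tautologies

classes = [{"h1_1"}, {"h2_1", "h2_2"}]
T = {frozenset({("h1_1", False), ("x1", True)})}  # h1_1 -> x1
padded, H_a, M, T2 = pad_prioritized(classes, {"h1_1"}, {"x1"}, T, 3)
print([len(k) for k in padded], len(T2))
```

The candidate explanation and the manifestations are returned unchanged, in line with the fact that only the fixed part of the instance is padded.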

Theorem 6 The problem of verification for any prioritization based on a
meaningful and irredundant ordering is $\|{\leadsto}\mathrm{C}$-hard for any class C for which
it is C-hard.

Proof. The composition $i''$ of $f''$ and $g''_c$ is a reduction satisfying representative
equivalence.

\[ i''(\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle) = f''(g''_{\mathrm{Class}(\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle)}(\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle)) \]

The fact that $i''$ is a reduction from the problem of verification to itself easily
follows from the fact that both $f''$ and $g''_c$ are. Moreover, the result of
$g''_{\mathrm{Class}(\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle)}(\langle \langle H_1, \ldots, H_m \rangle, H_a, M, T \rangle)$ is an instance in which
the number of classes of assumptions, of variables in each class, and the number
of other variables, all coincide with the class of the original instance.
The function $f''$ produces an instance in which everything but M and $H_a$
depends only on these numbers. As a result, the function $i''$ produces an
instance in which everything but M and $H_a$ depends on the class of the original
instance only. Therefore, $i''$ is a reduction from the problem of
verification to itself, satisfying representative equivalence, which implies the
incompilability of the problem. □

6.2 Relevance and Necessity 

We restrict the problems to the case in which the assumption we want to check for
relevance/dispensability/necessity is the first variable of $H_1$. The problems
have the same complexity as the general ones (in which the assumption can be
an arbitrary one). This, however, cannot be proved with a simple renaming
of the variables, as we did for the case of preference.

Theorem 7 Let $\preceq$ be a meaningful and irredundant ordering. It holds:

$h^i_j$ is relevant/necessary for $\langle \langle H_1, \ldots, H_m \rangle, M, T \rangle$

iff

t is relevant/necessary for
$\langle \langle \{t, s\}, H_1, \ldots, H_m \rangle, M \cup \{u, v\}, T \cup \{h^i_j \rightarrow u, t \rightarrow v, s \rightarrow u, s \rightarrow v\} \rangle$

Proof. We first give an informal sketch of the proof. The sets of solutions
(with no ordering) of the first and the second instance only differ because
the explanations for the second instance must contain either s or both $h^i_j$
and t.

The explanations of the second instance are first compared on the assumptions
in $H_1, \ldots, H_m$, and then on $\{s, t\}$. Therefore, the ordering for the
second instance is a refinement of the ordering of the first one. Namely, a
minimal solution of the second instance is either a minimal solution of the
first one plus s, or a minimal solution of the first one plus t. However, the
latter is a solution only if it contains $h^i_j$. Therefore, the presence of a solution
containing $h^i_j$ in the first instance is equivalent to the presence of a solution
for the second instance containing t.

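The formal proof is as follows. Let $\langle H, M, T \rangle$ be the first instance and
$\langle H', M', T' \rangle$ be the second one.

1. $S' \in SOL(H', M', T')$ implies $S' \setminus \{s, t\} \in SOL(H, M, T)$.

This can be proved as follows. First, since $S' \cup T'$ is consistent, it
follows that $S' \cup T$ is consistent as well (because $T \subseteq T'$), which also
implies that $(S' \setminus \{s, t\}) \cup T$ is consistent.

Let us now prove that $(S' \setminus \{s, t\}) \cup T \models M$. By assumption, we have
$S' \cup T' \models M'$. The following chain of implications leads to the claim.

\[
\begin{aligned}
& S' \cup T' \models M \cup \{u, v\} \\
\Rightarrow\ & S' \cup T' \models M \\
\Rightarrow\ & S' \cup T' \cup \neg M \text{ is inconsistent} \\
\Rightarrow\ & (S' \setminus \{s, t\}) \cup T \cup (S' \cap \{s, t\}) \cup \{h^i_j \rightarrow u, t \rightarrow v, s \rightarrow u, s \rightarrow v\} \cup \neg M \text{ is inconsistent} \\
& \quad \text{(since } u \text{ and } v \text{ appear only positively, set } u = v = \mathrm{true}\text{)} \\
\Rightarrow\ & (S' \setminus \{s, t\}) \cup T \cup (S' \cap \{s, t\}) \cup \neg M \text{ is inconsistent} \\
& \quad \text{(} s \text{ and } t \text{ appear at most once: they can be removed)} \\
\Rightarrow\ & (S' \setminus \{s, t\}) \cup T \cup \neg M \text{ is inconsistent} \\
\Rightarrow\ & (S' \setminus \{s, t\}) \cup T \models M
\end{aligned}
\]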

2. $S' \in SOL_\preceq(H', M', T')$ implies $S' \setminus \{s, t\} \in SOL_\preceq(H, M, T)$.

Proved by reductio ad absurdum. Assume that $S' \in SOL_\preceq(H', M', T')$,
but that $S' \setminus \{s, t\} \notin SOL_\preceq(H, M, T)$. As proved above, $S' \setminus \{s, t\} \in
SOL(H, M, T)$. As a result, it is not minimal: there exists another
$S'' \in SOL(H, M, T)$ such that $S''$ is better than $S' \setminus \{s, t\}$. As proved
above, $S'' \cup \{s, t\} \in SOL_\preceq(H', M', T')$. Moreover, $S'' \cup \{s, t\}$ is better
than $S'$, because s and t are in the lowest class of the prioritization.

3. If $h^i_j \in S$, then $S \in SOL_\preceq(H, M, T)$ if and only if $S \cup \{t\} \in SOL_\preceq(H', M', T')$.

By points 1 and 2 above, if $S \cup \{t\}$ is a minimal solution of the second
instance, then S is a minimal solution of the first one. We prove the
converse.

First of all, $S \cup \{t\}$ is a solution of the second instance. What is left to
prove is its minimality. This is also easy: removing t leads to a set of
assumptions which does not explain v. If removing some variable from
S leads to another solution, then S is not minimal.

4. If $h^i_j \notin S$, then $S \in SOL_\preceq(H, M, T)$ if and only if $S \cup \{s\} \in SOL_\preceq(H', M', T')$.

The "if" direction is easy. Let us assume that S is a minimal solution
of the first instance. Then $S \cup \{s\}$ is a solution of the second one.
Let us prove that it is minimal. We cannot remove variables from S,
otherwise S would not be minimal. As a result, the only other possible
explanations that can be preferred are $S \cup \{t\}$ and $S \cup \emptyset$. Neither of them
is a solution, because they do not imply u.

It is now possible to prove the claim. If there exists a minimal solution
of the first instance containing $h^i_j$, then there exists a minimal solution of
the second one containing t. On the other hand, if no minimal solution
contains $h^i_j$, then all corresponding minimal solutions of the second instance
contain s, which means that t is in none of them. Therefore, relevance and
necessity of $h^i_j$ in the first instance are equivalent to relevance and necessity,
respectively, of t in the second instance. □

As a result of this theorem, we can assume that relevance or dispensability
are evaluated w.r.t. the first variable in $H_1$. In order to prove that these
problems are not compilable, we give classification, representative, and
extension functions.

\[
\begin{aligned}
\mathrm{Class}(\langle \langle H_1, \ldots, H_m \rangle, M, T \rangle) &= \max(m, |H_1|, \ldots, |H_m|, |Var(T) \setminus (H_1 \cup \cdots \cup H_m)|) \\
\mathrm{Repr}(c) &= \langle \langle \{h^1_1, \ldots, h^1_c\}, \ldots, \{h^c_1, \ldots, h^c_c\} \rangle, \emptyset, \Pi(\{h^1_1, \ldots, h^1_c\} \cup \cdots \cup \{h^c_1, \ldots, h^c_c\} \cup \{x_1, \ldots, x_c\}) \rangle \\
\mathrm{Ext}(\langle \langle H_1, \ldots, H_m \rangle, M, T \rangle, m) &= \langle \langle H_1, \ldots, H_m \rangle, M, T \cup \{x_{r+1} \vee \neg x_{r+1}, \ldots, x_m \vee \neg x_m\} \rangle
\end{aligned}
\]

where $r = |Var(T) \setminus (H_1 \cup \cdots \cup H_m)|$.

Given these three functions, all that is needed is a reduction from the problem
of relevance to itself satisfying representative equivalence. The function $i''$
cannot be used only because the instance it deals with contains the set of 
assumptions H a . However, removing this part of the instance both from its 
argument and its result, we obtain a reduction with the right properties. 
We can thus conclude that the problems of relevance, dispensability, and 
necessity are incompilable. 

Theorem 8 Let $\preceq$ be a meaningful and irredundant ordering. The problems
of relevance, dispensability, and necessity for the problem of prioritized
abduction are $\|{\leadsto}\mathrm{C}$-hard for any class C of the polynomial hierarchy for which
these problems are C-hard.

As a result, we easily obtain the compilability properties of the problem
of prioritized abduction using the orderings $\subseteq$ and $\leq$.

Theorem 9 Relevance and dispensability are $\|{\leadsto}\Sigma^p_3$-hard if $\subseteq$ is used, and
$\|{\leadsto}\Delta^p_3$-hard if $\leq$ is used instead.

The compilability of relevance and dispensability in the case of penalization
is an easy consequence of the last theorem, as relevance with $\leq$
(prioritized) can be directly translated (using a nucomp reduction) to relevance
with penalization.

Corollary 2 Relevance and dispensability are $\|{\leadsto}\Delta^p_3$-hard, in the case of
penalization.

7 The Horn Case 

The Horn case can be dealt with using the same technique as the general case.
Since, however, only Horn clauses are allowed, each time we use $\Pi(H \cup X)$,
which contains all clauses of three literals over $H \cup X$, we have to replace it
with the set $\Pi_H(H \cup X)$ that contains all Horn clauses of three literals over the
set $H \cup X$. The reductions we used employ clauses $\neg c_i \vee \neg d_i$ and $\neg c_i \vee \gamma_i$,
which are Horn if $\gamma_i$ is Horn. The reduction used in Theorem 7 also involves
Horn clauses only. Therefore, all results holding for the general case hold
for the Horn case as well. Namely, all problems about Horn clauses are
$\|{\leadsto}\mathrm{C}$-hard for the same classes C for which they are C-hard. An important feature of
reductions from a problem to itself is that they allow proving nucomp-hardness
results even for a restriction of the problem, provided that the reduction does
not transform a valid instance into a non-valid one (e.g., provided that no Horn
instance is mapped into a non-Horn one).

The even more restricted case of definite Horn clauses, however, cannot
be dealt with in the same manner. Indeed, the clauses $\neg c_i \vee \neg d_i$ are not
definite. Some problems, however, become polynomial in this case. Namely,
all problems in the case of no ordering are polynomial, as well as necessity for
$\subseteq$-preference. We only show a reduction for the case of $\leq$-preference.
As before, the problem is that of checking whether $h_1$ is in an explanation
of minimal size of $\langle H, M, T \rangle$. Since $h_1$ is part of H, we regard $\langle H, M, T \rangle$
as being the instance of the problem. The classification, representative, and
extension functions are as usual (tautologies are definite Horn clauses).

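The reduction we use is based on the following function f, where $n = |H|$.

\[ f(\langle H, M, T \rangle) = \langle H', M', T' \rangle \]

where

\[
\begin{aligned}
H' &= H \cup \{c_i^j \mid \gamma_i \in \Pi_H(X \cup H),\ 1 \leq j \leq n + 1\} \\
M' &= M \cup \{c_i^j \mid \gamma_i \in T,\ 1 \leq j \leq n + 1\} \\
T' &= \{\gamma_i \vee \textstyle\bigvee \{\neg c_i^j \mid 1 \leq j \leq n + 1\} \mid \gamma_i \in \Pi_H(X \cup H)\}
\end{aligned}
\]

The idea is simply that of replicating each variable $c_i$ for n + 1 times.
This way, if $S \subseteq H'$, then a clause $\gamma_i$ holds in $S \cup T'$ only if S contains all
variables $c_i^j$.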

The reduction is based on the following two facts: 

1. definite Horn clauses are always consistent with sets of positive literals; 

2. checking the existence of explanations is polynomial. 

Therefore, the instance $\langle H, M, T \rangle$ can be solved by first checking whether
it has explanations. If it has, we can reduce it to $\langle H', M', T' \rangle$. Since both
T and T' are definite Horn theories, consistency is not an issue. In other words,
$S \subseteq H$ is an explanation of the first instance if and only if $S \cup T \models M$, and
the same holds for the second instance.

Lemma 5 For any $C' \subseteq C$ such that $|C'| < n$, it holds that S is an explanation
of $\langle H, M, T \rangle$ if and only if $S \cup \{c_i^j \mid \gamma_i \in T\} \cup C'$ is an explanation of
$f(\langle H, M, T \rangle)$.

Proof. The definition of explanation for definite Horn clauses is: $S \subseteq H$ is
an explanation if and only if $S \cup T \models M$. Consistency is not relevant, as any
definite Horn theory is consistent with any set of positive literals.

Let us first assume that S is an explanation of $\langle H, M, T \rangle$, that is, $S \cup T \models M$.
Since $\{c_i^j \mid \gamma_i \in T\} \cup T'$ implies $\{c_i^j \mid \gamma_i \in T\} \cup T$, we conclude that
$S \cup \{c_i^j \mid \gamma_i \in T\} \cup T'$ implies $M \cup \{c_i^j \mid \gamma_i \in T\}$. The set $S \cup \{c_i^j \mid \gamma_i \in T\}$ is
therefore an explanation because the latter set is indeed M'. The set C' is
not relevant to this part of the proof.

Let us now assume that $S' = S \cup \{c_i^j \mid \gamma_i \in T\} \cup C'$ is an explanation
of $f(\langle H, M, T \rangle)$. Since $|C'| < n$, the set S' does not contain all $c_i^j$ for any i
such that $\gamma_i \notin T$. Therefore, all clauses that are not in T contain at least an
unassigned $c_i^j$ in $S' \cup T'$. Therefore, these clauses cannot be used to derive a single
literal in M'. As a result, $S \cup \{c_i^j \mid \gamma_i \in T\} \cup T' \models M'$. This is equivalent to
$S \cup T \models M$, that is, S is an explanation of $\langle H, M, T \rangle$. □

This lemma can be used to relate the minimal explanations of the two 
instances. 

Lemma 6 If $\langle H, M, T \rangle$ has explanations, then S is one of its minimal
explanations if and only if $S \cup \{c_i^j \mid \gamma_i \in T\}$ is a minimal explanation of
$f(\langle H, M, T \rangle)$.

Proof. The lemma above implies that S is an explanation if and only if
$S \cup \{c_i^j \mid \gamma_i \in T\}$ is an explanation, as this is the case of $C' = \emptyset$. Let us now
prove that the minimality of these two explanations coincide.

Let us first assume that S is a minimal explanation. We prove that
$S' = S \cup \{c_i^j \mid \gamma_i \in T\}$ is a minimal explanation of $\langle H', M', T' \rangle$. By the
lemma above, S' is an explanation; we therefore only have to prove that
it is of minimal size. Assume that $S''$ is another explanation of $\langle H', M', T' \rangle$.
By construction, $S''$ contains $\{c_i^j \mid \gamma_i \in T\}$. Therefore, $S''$ can be smaller than
S' only if $S'' \setminus \{c_i^j \mid \gamma_i \in T\}$ is smaller than $S' \setminus \{c_i^j \mid \gamma_i \in T\}$. Since the latter
coincides with $S' \cap H$, whose size is bounded by n, we have that
$|S'' \setminus \{c_i^j \mid \gamma_i \in T\}| < n$. Therefore, $S''$ can be written as $S'' = S''' \cup \{c_i^j \mid \gamma_i \in T\} \cup C'$
with $|S''' \cup C'| < |S|$. The latter inequality implies $|C'| < n$: by the lemma
above, $S'''$ would be an explanation of $\langle H, M, T \rangle$. Since $|S''' \cup C'| < |S|$, then
$|S'''| < |S|$, that is, S is not minimal.

Let us now assume that $S' = S \cup \{c_i^j \mid \gamma_i \in T\}$ is a minimal explanation
of $\langle H', M', T' \rangle$, and prove that S is a minimal explanation of $\langle H, M, T \rangle$.
Assume, indeed, that $S'''$ is a smaller explanation of $\langle H, M, T \rangle$. By the lemma
above, $S''' \cup \{c_i^j \mid \gamma_i \in T\}$ would then be an explanation of $\langle H', M', T' \rangle$ smaller
than S'. □
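
The correspondence stated by Lemma 6 can be checked by brute force on a tiny definite Horn instance. The sketch below is an illustration under stated assumptions, not the authors' construction verbatim: a definite Horn clause is encoded as a rule (body, head), entailment of positive atoms is decided by forward chaining, $\Pi_H$ is taken to contain the rules with two body variables and one head over distinct variables, and the names `c<i>_<j>` are hypothetical and assumed fresh.

```python
from itertools import combinations

def closure(facts, rules):
    """Forward chaining: all atoms derivable from `facts` with definite Horn rules."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in derived and body <= derived:
                derived.add(head)
                changed = True
    return derived

def minimal_explanations(H, M, rules):
    """Smallest-cardinality subsets S of H whose closure contains every atom of M."""
    for k in range(len(H) + 1):
        found = {frozenset(S) for S in combinations(sorted(H), k)
                 if set(M) <= closure(S, rules)}
        if found:
            return found
    return set()

def horn_pi(alphabet):
    """Definite Horn rules with two body variables and one head, all distinct."""
    names = sorted(alphabet)
    return [(frozenset(body), head) for head in names
            for body in combinations([v for v in names if v != head], 2)]

def reduce_f(H, M, T, X):
    """Replicate each clause-marking variable n + 1 times, with n = |H|."""
    n = len(H)
    all_rules = horn_pi(set(H) | set(X))
    copies = {rule: {f"c{i}_{j}" for j in range(1, n + 2)}
              for i, rule in enumerate(all_rules, 1)}
    H2 = set(H).union(*copies.values())
    M2 = set(M).union(*(copies[r] for r in all_rules if r in T))
    T2 = [(body | copies[(body, head)], head) for body, head in all_rules]
    return H2, M2, T2

# Instance: h1 and h2 together imply m; the manifestation is m.
H, M, X = {"h1", "h2"}, {"m"}, {"m"}
T = [(frozenset({"h1", "h2"}), "m")]
H2, M2, T2 = reduce_f(H, M, T, X)
marks = M2 - M  # the copies c_i^j of the clauses in T
expected = {S | marks for S in minimal_explanations(H, M, T)}
print(minimal_explanations(H2, M2, T2) == expected)
```

The search over the reduced instance is exponential in the number of assumptions, so the sketch is only meant for instances of this size.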

The reduction can be defined as for the Horn case, by taking into account
the fact that the original instance $\langle H, M, T \rangle$ may not have any explanation.
Such a reduction satisfies the condition of representative equivalence, thus
proving that problems about $\leq$-preference are $\|{\leadsto}\mathrm{C}$-hard whenever they are
C-hard. Similar reductions can be defined for the other orderings.



8 Conclusions 

In this paper, we have shown that logic-based abduction cannot be simplified
by preprocessing the theory T and the hypotheses H. In particular, this
result holds for various kinds of explanation orderings, and also for the Horn
restriction. These results have been proved using the technique of representative
equivalence [Lib01]; since reductions are from a problem to itself, they
prove that a problem is "compilability-hard" for any class for which it is
hard. In other words, we did not prove that a problem is hard for some class,
but rather that its complexity does not decrease thanks to preprocessing. Using these
"self-reductions" allows for proving such a result even if the complexity of the
problem is not known. For example, we prove that a preprocessing step does
not simplify the problem of finding a minimal explanation for any ordering
that is both meaningful and irredundant. The complexity of this problem is
not known for all such orderings; moreover, it depends on the ordering itself.

The technique we used to prove that "preprocessing does not simplify
abduction", being ultimately based on complexity classes, should however not
be considered as implying that preprocessing is not useful for speeding up
the solving of abduction problems. Indeed, as for any result based on the theory
of NP-completeness, this conclusion only holds as a worst-case result. In
other words, it does not tell that no instance can ever be made simpler by
preprocessing, but simply that any preprocessing procedure necessarily has
some hard instances that are not simplified. In a sense, our result is more
positive than it appears, as it tells that a worst-case exponential on-line
algorithm is reasonable, given that no worst-case polynomial one exists.

Compilability results based on hardness and reductions have consequences 
similar to complexity results based on the theory of NP-completeness: they 
tell that, since no worst-case polynomial algorithm can solve the problem, 
alternative directions have to be considered. Approximation is one example: 
the preprocessing phase may result in some data structure that allows a bet- 
ter (or faster) approximation of the best abductive explanations. Another 
possible direction is that of incomplete compilation, in which the preprocess- 
ing phase produces a result that is only useful in some cases, but not always. 
Another common solution to hard-to-compile problems is that of generating 
a worst-case exponential preprocessing result. This approach is especially 
useful if part of the result can be used, as we can then try to generate it and 
use only the part we can store. All these alternative approaches, however, 
only make sense when the impossibility of preprocessing the problem into 
a polynomial problem has been proved. This is the practical impact of our 
hardness results. 

Finally, compilability has been proved to be related to expressibility of
logical formalisms, that is, their ability to represent information in little
space [CDLS00]. Logic-based abduction formalisms could then be characterized
by the set of abductive problems they are able to express. Compilation
classes (and not complexity ones) have been proved useful to this
aim.